Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

Authors

Nicolas Usunier

Gabriel Synnaeve

Zeming Lin

Soumith Chintala

Abstract

We consider scenarios from the real-time strategy game StarCraft as new benchmarks for reinforcement learning algorithms. We propose micromanagement tasks, which present the problem of the short-term, low-level control of army members during a battle. From a reinforcement learning point of view, these scenarios are challenging because the state-action space is very large, and because there is no obvious feature representation for the state-action evaluation function. We describe our approach to tackle the micromanagement scenarios with deep neural network controllers from raw state features given by the game engine. In addition, we present a heuristic reinforcement learning algorithm which combines direct exploration in the policy space and backpropagation. This algorithm allows for the collection of traces for learning using deterministic policies, which appears much more efficient than, for example, ε-greedy exploration. Experiments show that with this algorithm, we successfully learn non-trivial strategies for scenarios with armies of up to 15 agents, where both Q-learning and REINFORCE struggle.
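
The contrast the abstract draws between ε-greedy exploration and exploration in policy space can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the linear scoring function stands in for the deep network, the Gaussian perturbation is an assumed form of policy-space noise, and every function name here is hypothetical. The point is only that ε-greedy randomizes each individual action, whereas a perturbed deterministic policy is randomized once and then acts consistently for a whole episode.

```python
import random


def linear_scores(weights, features):
    """Score each action as the dot product of its feature vector with weights.

    Assumption: a linear model stands in for the deep network controller.
    """
    return [sum(w * f for w, f in zip(weights, feats)) for feats in features]


def epsilon_greedy(scores, epsilon, rng):
    """Per-step exploration: with probability epsilon, pick a uniformly random action."""
    if rng.random() < epsilon:
        return rng.randrange(len(scores))
    return max(range(len(scores)), key=lambda a: scores[a])


def perturbed_weights(weights, sigma, rng):
    """Episode-level exploration: draw one Gaussian perturbation of the parameters.

    Assumption: the caller keeps the returned weights fixed for the whole episode.
    """
    return [w + sigma * rng.gauss(0.0, 1.0) for w in weights]


def deterministic_action(weights, features):
    """Greedy action under the given (possibly perturbed) weights; no per-step noise."""
    scores = linear_scores(weights, features)
    return max(range(len(scores)), key=lambda a: scores[a])
```

In an episode loop, `perturbed_weights` would be called once at the start and `deterministic_action` at every step, so the collected trace comes from a single deterministic policy rather than from a sequence of independently randomized actions.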

Similar resources

Episodic Exploration for Deep Deterministic Policies for Starcraft Micromanagement

We consider scenarios from the real-time strategy game StarCraft as benchmarks for reinforcement learning algorithms. We focus on micromanagement, that is, the short-term, low-level control of team members during a battle. We propose several scenarios that are challenging for reinforcement learning algorithms because the state-action space is very large, and there is no obvious feature represent...

Neuroevolution for Micromanagement in the Real-Time Strategy Game Starcraft: Brood War

Real-Time Strategy (RTS) games have become an attractive domain for AI research in recent years, due to their dynamic, multi-agent and multi-objective environments. Micromanagement, a core component of many RTS games, involves the control of multiple agents to accomplish goals that require fast, real time assessment and reaction. In this paper, we present the application and evaluation of a Neu...

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an a...

Bayesian Networks for Micromanagement Decision Imitation in the RTS Game Starcraft

Real-time strategy (RTS) games provide various research areas for Artificial Intelligence. One of these areas involves the management of either individual units or small groups of units, called micromanagement. This research provides an approach that implements an imitation of the player's decisions as a means for micromanagement combat in the RTS game Starcraft. A Bayesian network is generated to fit ...

Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks

Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobje...

Journal:

CoRR

Volume abs/1609.02993

Publication date: 2016
